An AI-powered cybersecurity data analysis and summarization system that processes nested JSON data from Censys scans using multi-agent orchestration.
- Multi-Agent Architecture: Specialized agents for summarization, validation, and analysis
- Data Processing: Handles nested JSON structures with preprocessing
- Parallel Processing: Concurrent agent execution for improved performance
- Real-time Validation: Quality assurance with automated feedback
- Model Comparison: Side-by-side analysis of different AI models for debugging and development
- Extensible Design: Easy to add new data types and agents
- Host Scan Data: IP addresses, services, vulnerabilities, threat intelligence
- Certificate Data: SSL/TLS certificates, validation status, security analysis
- Summarization Agent: Generates structured security summaries
- Validation Agent: Validates summary quality and completeness
- Analysis Agent: Performs deep trend analysis and threat intelligence
- Orchestrator: Manages multi-agent workflows and parallelization
- Python 3.9+
- 16GB+ RAM (recommended for larger models)
- Ollama with supported models
```bash
git clone https://github.com/smith478/agentic-summarization.git
cd agentic-summarization
```

This project uses uv for package management. To create a virtual environment and install the required dependencies, run the following commands:

```bash
uv venv
source .venv/bin/activate
uv pip install -r requirements.txt
```

```bash
# Install Ollama (visit https://ollama.ai for platform-specific instructions)
curl -fsSL https://ollama.ai/install.sh | sh
```
```bash
# Pull required models
ollama pull qwen3:8b
ollama pull gpt-oss:20b
ollama pull gemma3:latest
ollama pull gemma3:270m
```

```bash
mkdir data
```
```bash
# Copy your JSON data files to the data directory
cp hosts_dataset.json data/
cp web_properties_dataset.json data/
```

Start the API server:

```bash
python main.py
```

The API will be available at http://localhost:8000

Launch the Streamlit UI:

```bash
streamlit run app.py
```

Access the UI at http://localhost:8501
```bash
curl http://localhost:8000/health
```

```python
import requests

# Prepare data
data = {
    "data": [{"ip": "192.168.1.1", "services": [...]}],
    "model": "qwen3:8b",
    "data_type": "hosts"
}

# Generate summary
response = requests.post("http://localhost:8000/summarize", json=data)
summary = response.json()
```

```python
comparison_data = {
    "data": your_data,
    "model1": "qwen3:8b",
    "model2": "gpt-oss:20b",
    "data_type": "hosts"
}
response = requests.post("http://localhost:8000/compare", json=comparison_data)
```

```python
analysis_data = {
    "data": your_data,
    "model": "gpt-oss:20b",
    "data_type": "certificates"
}
response = requests.post("http://localhost:8000/analyze", json=analysis_data)
```

```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Data Input    │───▶│  Preprocessor   │───▶│  Orchestrator   │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                │
         ┌──────────────────────┼──────────────────────┐
         │                      │                      │
         ▼                      ▼                      ▼
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│  Summarization  │    │   Validation    │    │    Analysis     │
│      Agent      │    │      Agent      │    │      Agent      │
└─────────────────┘    └─────────────────┘    └─────────────────┘
```
- Input Processing: JSON structures are preprocessed and formatted
- Agent Orchestration: Work is distributed across specialized agents
- Parallel Execution: Multiple agents run concurrently for performance
- Quality Validation: Automated validation ensures output quality
- Result Aggregation: Final results are compiled and formatted
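The workflow above can be sketched as a minimal orchestration loop. This is an illustrative simplification, not the project's actual API: the `summarize`, `validate`, and `orchestrate` functions here are stand-ins for the real agent classes.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the real agents, which call language models.
def summarize(record):
    return {"summary": f"host {record['ip']}: {len(record['services'])} services"}

def validate(record):
    return {"valid": bool(record.get("ip"))}

def orchestrate(records, max_workers=4):
    """Run summarization and validation concurrently, then aggregate."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        summaries = list(pool.map(summarize, records))
        checks = list(pool.map(validate, records))
    return [{"summary": s["summary"], "valid": c["valid"]}
            for s, c in zip(summaries, checks)]

records = [{"ip": "192.168.1.1", "services": [{"port": 80}]}]
print(orchestrate(records))
```

The thread pool mirrors the "Parallel Execution" step: independent agents run concurrently, and results are aggregated once all workers finish.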
```json
{
  "ip": "192.168.1.1",
  "location": {
    "country": "US",
    "city": "New York"
  },
  "services": [
    {
      "port": 80,
      "protocol": "HTTP",
      "vulnerabilities": [...]
    }
  ],
  "threat_intelligence": {
    "risk_level": "high"
  }
}
```

```json
{
  "domains": ["example.com"],
  "subject": {
    "common_name": "example.com"
  },
  "issuer": {
    "organization": "Let's Encrypt"
  },
  "validity_period": {
    "status": "active"
  },
  "security_analysis": {
    "risk_level": "low"
  }
}
```

Sample test files are provided in `data/`:

- `hosts_dataset.json`: Host scan data
- `web_properties_dataset.json`: Certificate data
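Preprocessing nested records like these into flat, prompt-friendly text might look like the sketch below. This is a minimal illustration; the project's actual `DataPreprocessor` is not shown here and likely does more.

```python
def flatten(record, prefix=""):
    """Recursively flatten a nested dict into dotted key/value lines."""
    lines = []
    for key, value in record.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            lines.extend(flatten(value, prefix=f"{path}."))
        elif isinstance(value, list):
            for i, item in enumerate(value):
                if isinstance(item, dict):
                    lines.extend(flatten(item, prefix=f"{path}[{i}]."))
                else:
                    lines.append(f"{path}[{i}]: {item}")
        else:
            lines.append(f"{path}: {value}")
    return lines

host = {"ip": "192.168.1.1", "location": {"country": "US"},
        "services": [{"port": 80, "protocol": "HTTP"}]}
print("\n".join(flatten(host)))
```

Flattening like this gives the model one fact per line (e.g. `services[0].port: 80`), which tends to be easier to summarize than raw nested JSON.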
Edit `main.py` to configure default models:

```python
DEFAULT_MODELS = {
    "summarization": "gpt-oss:20b",
    "validation": "qwen3:8b",
    "analysis": "qwen3:8b"
}
```

Adjust `AgentOrchestrator` parameters:

```python
orchestrator = AgentOrchestrator(
    max_workers=4,   # Concurrent agent limit
    timeout=120,     # Request timeout
    max_retries=3    # Retry attempts
)
```

Generate a structured summary:

```json
{
  "data": [...],
  "model": "qwen3:8b",
  "data_type": "hosts"
}
```

Compare two models:

```json
{
  "data": [...],
  "model1": "qwen3:8b",
  "model2": "gpt-oss:20b",
  "data_type": "hosts"
}
```

Perform deep analysis:

```json
{
  "data": [...],
  "model": "gpt-oss:20b",
  "data_type": "certificates"
}
```

- System health check
- List available models
- Clear agent cache
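Before sending requests to the endpoints above, it can help to validate the payload client-side. This helper is hypothetical (not part of the project); it checks only the fields shown in the request bodies above.

```python
VALID_DATA_TYPES = {"hosts", "certificates"}

def validate_payload(payload):
    """Check a /summarize or /analyze request body before sending it."""
    errors = []
    if not isinstance(payload.get("data"), list) or not payload["data"]:
        errors.append("'data' must be a non-empty list")
    if not payload.get("model"):
        errors.append("'model' is required, e.g. 'qwen3:8b'")
    if payload.get("data_type") not in VALID_DATA_TYPES:
        errors.append(f"'data_type' must be one of {sorted(VALID_DATA_TYPES)}")
    return errors

print(validate_payload({"data": [{"ip": "192.168.1.1"}],
                        "model": "qwen3:8b", "data_type": "hosts"}))  # []
```

Catching a malformed payload locally is cheaper than waiting on a round trip to the model service just to get a 422 back.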
```bash
# Check Ollama status
ollama list

# Restart Ollama service
systemctl restart ollama      # Linux
brew services restart ollama  # macOS
```

- Reduce `max_workers` in the orchestrator
- Use smaller models (gemma3:270m or qwen3:8b instead of gpt-oss:20b)
- Process data in smaller batches

```bash
# Verify model availability
ollama list

# Re-pull models if needed
ollama pull qwen3:8b
```

- Use appropriate model sizes for your hardware
- Enable caching for repeated operations
- Process similar data types in batches
- Monitor system resources during processing
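Processing data in smaller batches, as suggested above, can be as simple as chunking the input before each request. The batch size here is an illustrative tuning knob, not a project constant.

```python
def batches(items, size):
    """Yield successive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

hosts = [{"ip": f"10.0.0.{i}"} for i in range(10)]
chunks = list(batches(hosts, 4))
print([len(c) for c in chunks])  # [4, 4, 2]
```

Each chunk can then be sent as its own `/summarize` request, keeping individual prompts small enough for memory-constrained hardware.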
- You have `uv` installed and configured on your system.
- You have Ollama installed and running.
- The Censys data is located in the `data/` directory.
- The input data will keep the same structure in the future; if this is not the case, a more flexible approach would be better.
- Core Security Knowledge: Work with SMEs to craft the base prompts and gather feedback.
- General Cleanup: The application is currently a bit slow and a bit buggy.
- Model Exploration: Experiment with smaller models to explore the tradeoff between models that are cheaper to run with lower latency for the user vs. more expensive models with potentially higher accuracy (i.e. better summarizations).
- API Model Use: Test API models (e.g. Gemini, GPT, Claude) for cost and performance. We could also add fallbacks (e.g. default to self-hosted and fall back to an API, or vice versa). On a laptop, self-hosted models are very slow.
- Add a True FE: A full-fledged front end (e.g. JS with Vite or Svelte) in place of Streamlit, with the BE as a FastAPI service. Most of the BE functionality is already separated into a FastAPI service, and most of the logic is kept out of the Streamlit `app.py` file, to be closer to what a non-Streamlit FE would require.
- BE Model Service Optimization: Use Ray for parallelization and vLLM for the model inference service (far more performant than Ollama).
- Project Portability: Dockerize the project and add K8s orchestration.
- Testing: Add unit/integration tests.
- Custom Models: Fine-tune model(s). This requires dev time but could allow a much smaller model (e.g. a gemma3:270m-scale model), which would cut cost and latency. If there is no training data to start with, we could use a large model, tweak it until it performs well, and then use that synthetic data to train the smaller model.
- Model as a Judge: Add a language model as a judge to compare different outputs.
- Caching: Cache prompts to reduce model cost and latency.
- Additional Data Types: Support for more Censys data formats
- Advanced Analytics: Machine learning-based threat prediction
- Custom Models: Integration with fine-tuned security models
- Real-time Processing: Streaming data analysis capabilities
- Export Features: PDF/Word report generation
- Webhook Support: Automated notifications and integrations (including MCP)
- Custom Agents: Implement `BaseAgent` for specialized analysis
- Data Processors: Add new `DataPreprocessor` methods
- Prompt Templates: Extend `PromptTemplates` for new formats
- Validation Rules: Custom validation criteria and scoring
Here is an example of the analysis and output of the application:
Generated Summary for hosts_dataset.json with gpt-oss:20b:
🛡️ EXECUTIVE SUMMARY

All three hosts expose critical SSH services with the CVE‑2023‑38408 vulnerability (CVSS 9.8) and, on two hosts, the high‑severity CVE‑2024‑6387 (CVSS 8.1). Host 2 is compromised with Cobalt Strike C2 activity, indicating active exploitation. Overall risk level: CRITICAL – immediate remediation is required to prevent lateral movement and data exfiltration.

Key security concerns requiring immediate attention:

- Unpatched SSH services vulnerable to remote code execution.
- Active Cobalt Strike presence on Host 2.
- Exposed administrative interfaces on non‑standard ports (11558, 8082).
📊 INFRASTRUCTURE OVERVIEW
Total Hosts Analyzed: 3
Geographic Spread: China, United States
High‑Risk Assets: 3 hosts (all identified as HIGH or CRITICAL)
Service Diversity: 10 unique services (SSH, HTTP, FTP, MySQL)
🎯 CRITICAL SECURITY FINDINGS
🔍 Suspicious Activities

- SSH on non‑standard port 11558 (Host 1) – unusual exposure.
- HTTP 401 Unauthorized on port 8082 (Host 2) – potential admin panel.
- HTTP 403 Forbidden on port 888 (Host 3) – possible restricted resource.
- FTP with TLS (Host 3) – while TLS is enabled, the service is still exposed to the internet.
- Cobalt Strike on Host 2 – indicates backdoor and remote control.

Actionable Insight: Disable or restrict access to non‑essential ports; conduct a full audit of HTTP endpoints for hidden admin interfaces.

🚪 Attack Surface Analysis

- Exposed administrative interfaces: SSH on ports 11558, 22, 22 (all hosts); HTTP 8082 (Host 2) – likely admin.
- Weak authentication mechanisms: no evidence of multi‑factor authentication; default SSH key usage not verified.
- Unencrypted communications: FTP (plain or weak TLS) on Host 3; HTTP (unencrypted) on multiple ports.

Actionable Insight: Enforce MFA for SSH, move HTTP services to HTTPS, and consider disabling FTP in favor of SFTP.

🌍 THREAT LANDSCAPE

Geographic Risk Distribution

- China (Beijing & Shanghai): Host 2 (CRITICAL) and Host 3 (HIGH) both expose critical SSH vulnerabilities. Cross‑border risk: potential for coordinated attacks from Chinese infrastructure.
- United States (New York City): Host 1 (HIGH) with critical SSH vulnerability and non‑standard port exposure.

Actionable Insight: Apply a stricter firewall policy for inbound traffic from China, especially to SSH and HTTP ports.

Service Vulnerabilities

- SSH (all hosts): CVE‑2023‑38408 (critical) and CVE‑2024‑6387 (high).
- HTTP (Hosts 2 & 3): 401/403 responses suggest exposed admin endpoints; potential for credential brute‑force.
- FTP (Host 3): TLS enabled but still vulnerable to downgrade attacks.
- MySQL (Host 3): no CVE listed, but exposed to the internet – high risk of credential theft.

Actionable Insight: Prioritize patching SSH, move HTTP to HTTPS, disable public MySQL access, and enforce strong passwords.

🔥 PRIORITY RECOMMENDATIONS

🚨 IMMEDIATE ACTIONS (24‑48 hrs)

- Patch SSH services on all hosts (1.68.196.241.227, 1.92.135.168, 1.94.62.205) to the latest OpenSSH release.
- Block inbound traffic to non‑essential ports: close port 11558 (Host 1); restrict port 8082 (Host 2) to internal IPs only.
- Isolate Host 2 (1.92.135.168) from the network; run a full malware scan to remove Cobalt Strike.
- Enforce MFA for SSH logins on all hosts.
- Disable FTP on Host 3; switch to SFTP or secure file transfer mechanisms.

⏰ SHORT‑TERM IMPROVEMENTS (1‑2 weeks)

- Implement TLS for all HTTP services (ports 80, 8011, 37035) and enforce HSTS.
- Deploy intrusion detection (e.g., Snort/Suricata) to monitor for SSH brute‑force and Cobalt Strike signatures.
- Update firewall rules to allow only necessary ports (22, 80, 443, 3306) from trusted IP ranges.
- Change default credentials on MySQL (Host 3) and enforce password complexity.
- Conduct a penetration test focused on SSH and HTTP endpoints to validate remediation.

📈 STRATEGIC ENHANCEMENTS (1‑3 months)

- Establish a patch management program with automated vulnerability scanning and remediation workflows.
- Implement a centralized logging and SIEM solution to correlate events across all hosts and detect lateral movement.
- Introduce network segmentation: isolate database servers (MySQL) and administrative services from public‑facing services.
- Develop an incident response playbook tailored to SSH exploitation and C2 detection scenarios.
- Schedule regular security awareness training for administrators to recognize phishing and credential compromise attempts.

